Document relevance calculation based on lexical cohesion with structure analysis

Authors

  • Yuming Zhao
  • Bingquan Liu
  • Xiaolong Wang
Abstract

This paper explores the feasibility of building a document relevance model based on lexical cohesion combined with structure analysis. The model extracts clusters of semantically related words from each document according to the principle of lexical cohesion, and represents the document as a set of lexical chains annotated with structure information. On this representation, document relevance calculation is replaced by calculating the semantic distance between lexical chains. The approach was evaluated in experiments on a Chinese Library Classification (CLC) dataset. The results show that the method makes good use of the background knowledge of ordinary users and is effective for calculating document relevance.
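The general idea in the abstract (group related words into chains, then compare documents through their chains) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not say which lexical resource, chain-building procedure, or distance measure the authors use. `TOY_THESAURUS`, `build_chains`, `chain_weights`, and `relevance` are hypothetical names, token positions stand in for "structure information", and a simple weight-overlap similarity is swapped in for the paper's semantic distance.

```python
from collections import defaultdict

# Hypothetical toy thesaurus mapping words to a concept label.
# The lexical resource actually used in the paper is not given in the abstract.
TOY_THESAURUS = {
    "car": "vehicle", "truck": "vehicle", "engine": "vehicle",
    "bank": "finance", "loan": "finance", "interest": "finance",
    "river": "water", "lake": "water",
}


def build_chains(tokens):
    """Group tokens into chains keyed by concept.

    Each chain stores the positions of its words, a crude stand-in
    for the structure information mentioned in the abstract.
    """
    chains = defaultdict(list)
    for position, token in enumerate(tokens):
        concept = TOY_THESAURUS.get(token.lower())
        if concept is not None:
            chains[concept].append(position)
    return dict(chains)


def chain_weights(chains):
    """Weight each chain by its share of all chained words (weights sum to 1)."""
    total = sum(len(positions) for positions in chains.values())
    if total == 0:
        return {}
    return {concept: len(positions) / total for concept, positions in chains.items()}


def relevance(tokens_a, tokens_b):
    """Similarity in [0, 1]: overlap of the two chain-weight distributions.

    This is an assumed measure, not the paper's semantic distance.
    Positions are recorded by build_chains but not used here.
    """
    weights_a = chain_weights(build_chains(tokens_a))
    weights_b = chain_weights(build_chains(tokens_b))
    shared = weights_a.keys() & weights_b.keys()
    return sum(min(weights_a[c], weights_b[c]) for c in shared)


doc_a = "the car engine and the truck".split()
doc_b = "a truck near the bank".split()
doc_c = "the river and the lake".split()

print(build_chains(doc_a))
print(relevance(doc_a, doc_b))
print(relevance(doc_a, doc_c))
```

In this toy setting, documents that share chains score above zero and documents with disjoint chains score zero. A closer approximation of the described approach would replace the lookup table with a real thesaurus and compare chains by a graded semantic distance rather than exact concept match.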


Similar resources

On document relevance and lexical cohesion between query terms

Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms’ occurrences in a document is related to its relevance to the query. Lexica...


A Study of Document Relevance and Lexical Cohesion between Query Terms

Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to its relevance to the query. Experim...


Lexical Chain Based Cohesion Models for Document-Level Statistical Machine Translation

Lexical chains provide a representation of the lexical cohesion structure of a text. In this paper, we propose two lexical chain based cohesion models to incorporate lexical cohesion into document-level statistical machine translation: 1) a count cohesion model that rewards a hypothesis whenever a chain word occurs in the hypothesis, 2) and a probability cohesion model that further takes chain ...


Term Relationships and their Contribution to Text Semantics and Information Literacy through Lexical Cohesion

An analysis of linguistic approaches to determining the lexical cohesion in text reveals differences in the types of lexical semantic relations (term relationships) that contribute to the continuity of lexical meaning in the text. Differences were also found in how these lexical relations join words together, sometimes with grammatical relations, to form larger groups of related words that some...


Lexical cohesion and term proximity in document ranking

We demonstrate effective new methods of document ranking based on lexical cohesive relationships between query terms. The proposed methods rely solely on the lexical relationships between original query terms, and do not involve query expansion or relevance feedback. Two types of lexical cohesive relationship information between query terms are used in document ranking: short-distance collocati...



Journal:
  • Journal of Chinese Language and Computing

Volume 18, Issue

Pages -

Publication date: 2008